A FUNNY THING HAPPENED ON THE WAY TO THE PRINTER:
THE SAGA OF DEVELOPING A CUSTOMIZED TEST

Many school districts are moving away from a reliance on norm-referenced
tests [...] (rightly so) that the criterion-referenced tests [...] of
learning. Research has well established that norm-referenced [...] and that
the overlap between curriculum and test content affects test scores. Given
these somewhat simple-minded facts, it is clear that the closer the match
between what is taught and what is tested, the more accurate the score will
be.

Finding the right criterion-referenced test is not, however, a simple
matter. Whether one goes with a test developed by "experts," the
publishers, or prefers to adopt a home-grown variety, there are
complications. What I'd like to do today is share with you some
complications we encountered last year when we went the "expert" route. I
assure you that everything I am going to be saying to you is the truth; I
have not added anything to increase the dramatic effects.

First, a few words about the context of our effort. We were in the process
of evaluating a new reading/language arts program that had been developed
in our county. The purpose of the evaluation was to determine whether the
program was being implemented as planned and whether, once implemented, it
had any effect on achievement or attitudes toward reading. The study is a
three-year, longitudinal effort in which two groups of students are being
followed: an early elementary group (starting in Grade 1) and a later
elementary group (starting in Grade 4). Because we did not at that time do
any end-of-year testing in reading/language arts on a systemwide basis, we
wanted a criterion-referenced test that we could administer at the end of
the school year to assess the first and fourth graders' end-of-year skills.

Our first challenge was to find a test that the program developer would
accept as being a good measure of [...] at these two grade levels. This was
not an easy task, as he had some very definite ideas about both what should
be measured and how it should be measured. The fact that
criterion-referenced tests had already been developed by the program
developer and his staff (tests we did not choose to use both because they
had been administered earlier in the year and because they had been highly
criticized by some school staff) rendered selecting an outside measure even
more difficult. We brought in several outside groups to pitch their wares,
show us items, and negotiate dollars. Finally, after extensive review and
time, we found a group that appeared to have a product we could live with.

The test bank, as it was described to us, had some uniquely appealing
features. To start with, if one included certain items (regardless of their
curriculum match), not only could the district receive criterion-referenced
data but also norm-referenced data. Thus, national comparisons could be
made as well as more local ones. Second, the available item bank was rather
extensive, as the publisher had taken items from the many different tests
produced by the company. Third, although items on the tests were arranged
into objective groupings, we were told that we could have our test
"customized": we could create our own item groupings and give them our own
objective names. This was a new feature the company was offering, and we
became one of the first clients to take advantage of this option.

What came next can only be described as a "comedy of errors." Having
selected a [...] and developmental dilemmas, we wound up encountering one
disaster after another. Whether these were caused by unworkable time
constraints or incompetent staff, I won't venture to say. Let me lay out
the problems we encountered and leave you to decide how you would allocate
the responsibility.

PROBLEM ONE: THE LISTENING TEST

The normed items on the first-grade test were described to us by the local
rep and her regional manager as orally administered. This allowed the
possibility of obtaining measures of text comprehension which were not
necessarily tied to decoding skills. This had a good deal of appeal to us
because our new curriculum stressed comprehension at all grade levels and
some teachers felt that existing tests did not allow the student who was
essentially a nonreader to show his/her listening comprehension skills. In
building the first-grade test we decided, however, to include a second,
nonoral section in order to be able to measure reading skills. In order to
[...], we included the normed items in a listening part, [...] some [...]
to cover the [...] we wanted to test, and then built a second, somewhat
parallel "reading" portion. Accomplishing this without building a test that
was too lengthy for the first grader wasn't easy, but with a good deal of
work and negotiation between the evaluation and development staff, we got
it done.

Tests ordered, covers selected, we notified the principals whose schools
were participating in the reading study that the test which had been
tentatively promised was about to become a reality. While the need for
additional testing was not welcomed, the possibility of getting normed as
well as criterion-referenced data and the interest in the listening measure
seemed to result in a decision of the pluses outweighing the minuses. They
were ready; eager.

One bit of information not in the prepackaged material provided by the
publisher was the instructions for administering the listening test. We
needed to know how much time was allowed, [...] the alternative answer
choices were read, etc. For some reason, the local rep was having problems
getting such [...] office and kept returning with the answer that there
were no standard directions and that we should develop our own. We found
this response extremely [...] and wondered how one could have norm data
without standard directions. We began to get very nervous.

Then one night around six I was home having my first [...] of the evening
when I got a long distance call from the main office of the publisher. They
were finalizing the test and had some questions about how items were
grouped into subsections and objectives. In the course of this
conversation, I naturally had occasion to refer to the listening section
and the reading section. To make a long story short (and believe me, we had
many conversations on this topic in a relatively short period of time), we
found out that the information we had been given about the items being
normed as a listening test wasn't quite true. What was meant was that the
items were to be given orally, but the test was a standard test of reading
at the first grade, not a test of listening comprehension as we had been
told. We had created a truly [...] product, and it was too late to [...].
All we could do was go ahead and create directions for test
administration, and hope.

PROBLEM TWO: THE TEST INSTRUMENTS

Shortly after the conversations described above, the tests arrived, nearly
two thousand of them, ready for administration. We had assumed we'd be sent
a galley for proofing before the actual printing, but either through
oversight or perceived lack of time, this step was skipped.

The first thing we noticed as we unpacked the test booklets was that the
covers were white, not the colors which we had painstakingly chosen. We
were disappointed, but we figured we could live with white covers, as long
as the charge for colored covers was taken off the bill.

Next we opened the booklets and found ... Pandora's box. To start with,
despite our previous conversations, some [...]. We hadn't selected them,
they didn't match, and they couldn't be used. If that weren't enough, the
items, because they had been taken from a variety of tests, were in varied
type styles! Finally, directions for the items were inconsistently worded
and in some [...].

It took us many conversations and displays of temper to get changes made in
the format of the items so that they were reasonably consistent. While the
publisher's staff did not necessarily agree that inconsistencies in format
and directions might throw off first and fourth graders, they finally gave
in and agreed to make as many changes as possible. They also agreed to give
us tests with the items we had selected instead of ones which apparently
had been developed by gremlins. We threw out the initial cartons of test
booklets and waited. This time, facsimiles were telecopied to us before
printing, so that we could see the instruments. The next batch was what we
had ordered and looked usable.

PROBLEM THREE: THE ITEM TAPE



We made it through test administration with no new disasters, collected the
booklets, and sent them off to be scored. Since the tests were being
administered for the first time and we were not totally sure whether or
not all the items would be acceptable, we asked for a tape of item
responses so we could conduct some preliminary analyses before score
reports were produced for individual schools. We had a suspicion that some
items might not "work" and that they would have to be thrown out.

Items of special concern were ones where many students had selected an
incorrect answer. We wanted to be sure that the students hadn't been in
some way thrown off by the wording or structure of the question and that
finding a high failure rate would provide instructionally useful
information. What we found instead was that in a number of instances the
item had been incorrectly scored. The gremlins were at it again. When we
called the publisher about this problem, we found out that they had found
the errors and made valid corrections on their tape but had not done so on
ours. Rather than ask them to send us a new tape (and all the possibilities
that a new tape might open up), we went ahead, made our own corrections,
and ran our item analyses.
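
For readers who want to picture the kind of check involved, here is a
minimal sketch in Python. It is an illustration only, not the analysis we
actually ran: the response data, the answer key, the function name
item_analysis, and the 50 percent failure threshold are all hypothetical.
It flags items with a high failure rate and notes when the most frequently
chosen answer differs from the keyed answer, which is one sign that an
item may have been scored against the wrong key.

```python
from collections import Counter

def item_analysis(responses, key, max_fail_rate=0.5):
    """Flag items whose failure rate exceeds max_fail_rate (hypothetical cutoff).

    responses: list of dicts, one per student, mapping item id -> chosen option
    key:       dict mapping item id -> option scored as correct
    """
    flagged = {}
    for item, correct in key.items():
        # Tally how often each answer option was chosen for this item.
        choices = Counter(r[item] for r in responses if item in r)
        n = sum(choices.values())
        if n == 0:
            continue  # nobody answered this item
        fail_rate = 1 - choices[correct] / n
        most_chosen = choices.most_common(1)[0][0]
        if fail_rate > max_fail_rate:
            flagged[item] = {
                "fail_rate": round(fail_rate, 2),
                "most_chosen": most_chosen,
                # If most students agree on an answer other than the keyed
                # one, the key itself deserves a second look.
                "possible_miskey": most_chosen != correct,
            }
    return flagged

# Made-up responses from four students on two items.
responses = [
    {"q1": "A", "q2": "C"},
    {"q1": "A", "q2": "C"},
    {"q1": "B", "q2": "C"},
    {"q1": "A", "q2": "D"},
]
key = {"q1": "A", "q2": "B"}  # q2 is keyed "B", yet no student chose "B"

report = item_analysis(responses, key)
print(report)
```

In this made-up data, q1 passes quietly while q2 is flagged: every student
"failed" it, and three of the four chose the same unkeyed answer. A flag of
this kind says only that a person should look at the item; it cannot tell a
scoring error from a genuinely confusing question.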



Reports of school performance included the average percentage of students
mastering each objective as well as average test performance. At first
[...] some totals that struck us as rather odd: in a number of cases, more
than 100 percent [...].

Divining the cause was not really very hard. What had happened was the
following. In grouping objectives into subtests, an objective had been
included in two different subtests. (The item wasn't actually on the test
twice; it was used twice for scoring purposes.) The mean percentage had
been calculated by a program that did not take into account this
possibility. Thus, when the average was calculated, it was divided by the
number of different objectives, not the number of objectives actually
scored. If someone had looked at the reports before they were shipped, as
one would expect, percentages [...] would have been caught. However, since
apparently no one even looked at the test booklets before these were sent,
we shouldn't have been surprised.
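
The arithmetic behind this error can be sketched in a few lines of Python.
This is a reconstruction under assumptions, not the publisher's program,
which we never saw: the objective names, the percentages, and the function
name mean_percent_mastery are hypothetical. The sketch sums the percentages
over every scoring slot but, in the faulty version, divides by the number
of distinct objectives.

```python
def mean_percent_mastery(entries, buggy=False):
    """Average percent mastering across scoring slots.

    entries: list of (objective_name, percent_mastering) pairs, one per
             scoring slot; an objective placed in two subtests appears twice.
    buggy:   if True, reproduce the assumed error by dividing by the number
             of distinct objective names instead of the number of slots.
    """
    total = sum(pct for _, pct in entries)
    if buggy:
        divisor = len({name for name, _ in entries})  # duplicates collapse
    else:
        divisor = len(entries)  # every scored slot counts once
    return total / divisor

# Hypothetical school report: "main idea" was grouped into two subtests,
# so it contributes to the sum twice.
entries = [("main idea", 90.0), ("sequence", 80.0), ("main idea", 90.0)]

buggy_mean = mean_percent_mastery(entries, buggy=True)   # 260 / 2
correct_mean = mean_percent_mastery(entries)             # 260 / 3
print(buggy_mean, round(correct_mean, 1))
```

With these made-up numbers the faulty version reports an average of 130
percent while the corrected version reports about 86.7 percent. Any average
above 100 percent is impossible, so even a simple range check on the
printed reports would have caught the problem before shipping.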

NORM DATA 






The publishing company offered a variety of ways of reporting norm data. The
options varied in unit of analysis and type of score presented (stanines,
grade equivalents, etc.). A certain number of scores came as part of the
package; others required additional funds.

When we received [...] our fourth grade students to the national sample,
the gremlins got in the way once more. Instead of receiving the four scores
that we had requested, we received one that we had requested and three that
we had not. By this time, we pretty much expected that something would go
wrong. Calmly, we called and explained. The correct reports were duly
dispatched.

CONCLUSION

Believe it or not, after all that, the data we got from the tests were
extremely useful. And, it turned out that the listening test for first
graders provided us with some very useful information about the listening
comprehension skills of students who were nonreaders. In fact, we were
surprised at how well some students did on what had been judged to be fairly
complicated passages.

Now about our relationship with the publisher. It's a bit sensitive, but
we're still on speaking terms (or at least were until this AERA
presentation). We haven't crossed them off our list, but we're not buying
another test this year. For a variety of reasons, we are turning to the
home-grown alternative. From what's been happening so far, I think I'll be
back next year with a whole new set of stories.
